Video signals compression/decompression device for video disk
recording/reproducing apparatus.

An apparatus for generating compressed signals, which can serve to
record data with high efficiency, to manage data easily, to reproduce
programs in a special manner, to search for data at high speed, and to
achieve accurate image-speech synchronization. The apparatus comprises
a video-data grouping device (103) for processing video data into
groups, each consisting of a predetermined number of video data items
corresponding to image frames, a video-data compressing device (106)
for compressing and encoding the video data items of each group, an
audio-data grouping device (102) for processing audio data
corresponding to the video data into groups, each group consisting of
audio data items, an audio-data compressing device (105) for
compressing and encoding the audio data items of each group, a
sub-video data grouping device (104) for processing sub-video data into
groups, each consisting of sub-video data items, a sub-video data
compressing device (107) for compressing and encoding the sub-video
data items of each group, a formatter (108) for combining the groups of
compressed video data items, the groups of compressed audio data items
and the groups of compressed sub-video data items, thereby generating a
data unit, a data-separating device (121) for separating the compressed
data items, a speech decoder (122) for decoding the encoded audio data
items, an image decoder (123) for decoding the encoded video data
items, a sub-image decoder (124) for decoding the encoded sub-video
data items, and a synthesizing device which combines the decoded video
data items with the decoded sub-video data items.

The present invention relates to a disk such as a writable and readable
magnetic disk or a writable and readable optical disk, and also to an
apparatus for processing compressed video signals, which can
effectively operate in a recording/reproducing apparatus using a
magnetic disk, an optical disk or a CD-ROM as a recording medium.

In recent years, magnetic disks and optical disks have been used in
various information systems as recording media for storing a great
amount of data. A number of programs can now be stored on a magnetic
disk or an optical disk, thanks to the recent progress in the
technology of code-compressing video data at a high ratio. Various
systems for compressing moving-picture data are known. One example is
the system which is defined in ISO-11172 (MPEG).

It is expected that recording/reproducing apparatuses using disks as
recording media will be used in increasing numbers.

To record more data on a disk, it is desirable not only to record
signals which have been generated by compression-coding or
variable-length coding by means of the moving-picture compression, but
also to increase the recording density of the data-recording region of
a disk. There is a demand for techniques of reproducing such signals
from a disk in a special manner and of searching them at high speed.
When a video signal is coded by moving-picture compression, the video
signal and the audio signal which accompanies it must be synchronized.

If a disk is damaged or stained with dirt, a 
reproducing apparatus will not be able to read the 
important information from the disk. Once the table 
of information for managing the programs recorded 
on the disk has been destroyed, the apparatus can 
no longer reproduce the important information from 
the disk. 

Accordingly, the first object of the present invention is to provide an
apparatus for processing compressed video signals, which can record
data efficiently, which can easily manage data, which can reproduce
programs in a special manner and search them at high speed, and which
can synchronize a video signal and an audio signal by using simple
means.

The second object of the present invention is to provide a disk which
can minimize the possibility that important information to be used in
the apparatus is destroyed completely when the disk is damaged.

According to a first aspect of the invention, there is provided an
apparatus for generating compressed signals, which comprises: image
grouping/compressing means for processing video data into groups, each
consisting of video data items or frames which correspond to a
predetermined reproducing time, and for compressing and encoding the
video data items of each group; speech grouping/compressing means for
processing audio data corresponding to the video data into groups, each
group consisting of audio data items, and for compressing and encoding
the audio data items of each group; and a formatter for combining a
plurality of groups of compressed and encoded video data items, which
have been supplied from the image grouping/compressing means, into a
video-data packet, for combining a plurality of groups of compressed
and encoded audio data items, which have been supplied from the speech
grouping/compressing means, into an audio-data packet, for combining at
least the video-data packet and the audio-data packet into a data unit,
and for supplying the data unit to a recording system or a transfer
system.

According to a second aspect of this invention, there is provided an
apparatus for reproducing compressed signals from a data unit (DUT)
comprised of video data groups each consisting of compressed and
encoded video data items or frames which correspond to a predetermined
reproducing time, audio data groups corresponding to the video data
groups, each consisting of compressed and encoded audio data items, and
sub-video data groups corresponding to the video data groups, each
consisting of compressed and encoded sub-video data items. This
apparatus comprises: a speech decoder for separating the audio data
groups from the data unit and decoding the audio data groups, thereby
generating decoded audio data; an image decoder for separating the
video data groups from the data unit and decoding the video data
groups, thereby generating decoded video data; a sub-image decoder for
separating the sub-video data from the data unit and decoding the
sub-video data, thereby generating decoded sub-video data; and data
synthesizing means for combining the decoded video data generated by
the image decoder and the decoded sub-video data generated by the
sub-image decoder.

According to a third aspect of the present invention, there is provided
an apparatus for managing compressed signals, which comprises memory
means for storing program information recorded on a recording medium,
the program information forming a data allocation table which consists
of numbers assigned to tracks on the recording medium, numbers assigned
to zones forming each track, numbers assigned to sectors forming each
track, and a link pointer of a data unit to be reproduced next.

According to a fourth aspect of the invention, there is provided an
apparatus for synchronizing compressed signals, which comprises an
encoder section and a decoder section. The encoder section comprises:
image grouping/compressing means for encoding a predetermined number of
image frames which corresponds to a predetermined reproducing time of
an original image, thereby generating encoded video data items, and for
combining the encoded video data items into a video packet; speech
grouping/compressing means for processing encoded audio data
corresponding to the packet of encoded video data items, thereby
generating speech frames, and for combining the speech frames into an
audio packet; additional data generating means for generating
additional data consisting of a speech frame number assigned to that
speech frame included in the audio packet which represents an original
speech corresponding to a start timing of a specified image frame
included in the video packet; and a formatter for combining the
additional data, the audio packet and the video packet into a data
unit. The decoder section comprises: decoding means for decoding the
encoded video data, encoded audio data and additional data of each data
unit; and output timing setting means for setting timing of outputting
a first specified image frame, when a speech frame number contained in
the encoded audio data coincides with a speech frame number contained
in the additional data.

According to a fifth aspect of the invention, there is provided a disk
structure having a management area on a central portion and a data area
surrounding the management areas, wherein identical management data
items are recorded in the management areas, data to be accessed based
on the management data item is recorded in the data area, and starting
positions of the identical management data items are set on different
radial lines spaced apart by different angles.

This invention can be more fully understood from the following detailed
description when taken in conjunction with the accompanying drawings,
in which:

FIG. 1A is a diagram schematically representing code data generated in
a first embodiment of the invention;

FIG. 1B is a diagram schematically showing an output image obtained by
decoding the coded data shown in FIG. 1A;

FIG. 1C is a diagram illustrating the format of compressed signals
generated in the first embodiment of the invention;

FIG. 2A is a diagram schematically representing code data generated in
a second embodiment of the invention;

FIG. 2B is a diagram schematically showing an output image obtained by
decoding the coded data shown in FIG. 2A;

FIG. 2C is a diagram illustrating the format of compressed signals
generated in the second embodiment of the invention;

FIG. 3A is a diagram schematically representing code data generated in
a third embodiment of the invention;

FIG. 3B is a diagram schematically showing an output image obtained by
decoding the coded data shown in FIG. 3A;

FIG. 3C is a diagram illustrating the format of compressed signals
generated in the third embodiment of the invention;

FIG. 4A is a diagram showing the management area and data area of a
disk according to the present invention;

FIG. 4B shows the data unit allocation table (DAT) on the disk shown in
FIG. 4A;

FIG. 5A shows the management table recorded in the management area of
the disk;

FIG. 5B is a table showing the contents of 16 bytes in the program
information field (PIF) on the disk;

FIG. 5C is a table showing the structure of the DAT;

FIG. 6A is a diagram representing the address arrangement of the
management table shown in FIG. 5A, particularly the address arrangement
of the DAT;

FIG. 6B is a diagram showing an example address arrangement which the
management table may assume;

FIG. 7 is a block diagram showing the first embodiment of the
invention;

FIG. 8A is a diagram showing the format of encoded video data;

FIG. 8B is a diagram representing the format of encoded audio data;

FIG. 8C is a diagram showing the format of encoded additional data;

FIG. 9 is a block diagram showing an example of the encoder
incorporated in the system for processing the data units shown in
FIGS. 8A, 8B and 8C;

FIG. 10 is a block diagram showing an example of the decoder
incorporated in the system for processing the data units shown in
FIGS. 8A, 8B and 8C;

FIG. 11 is a block diagram showing another example of the decoder
incorporated in the system for processing the data units shown in
FIGS. 8A, 8B and 8C;

FIG. 12 is a block diagram illustrating a recording/reproducing
apparatus which is a second embodiment of the present invention;

FIG. 13 is a block diagram showing the data-string processing section
of the apparatus shown in FIG. 12;

FIG. 14 is a table showing the structure of the header section of a
data unit;

FIG. 15A is a perspective view of the disk according to the present
invention;

FIG. 15B is a diagram illustrating the spiral track formed on the disk;

FIG. 16 is a diagram showing the contents of data unit DUT #0 recorded
in the data area of the disk;

FIG. 17A illustrates the table recorded in the volume identity field
(VID) on the disk and showing the correspondence between description
codes and language codes;

FIG. 17B shows the table recorded in the PIF on the disk;

FIG. 17C is a table showing the meaning of each description code; and

FIG. 18 is a flow chart explaining the operation of the data-string
processing section of the apparatus shown in FIG. 12.

Embodiments of the present invention will now be described, with
reference to the accompanying drawings.

The moving-picture compression format used 
in the present invention will be first explained. To 
encode video data, groups of pictures (GOPs) are 
combined, forming a packet, and audio data (for 
approximately 1.0 second) and expansion data, 
both for the packet, are encoded. The data thus 
encoded is added to the compressed video data, 
forming a data unit. Each GOP is fixed in the same 
program. A speech synchronizing time code is 
arranged as the header (i.e., the first part of the 
data unit), and sub-video data is arranged next to 
the header. 

FIG. 1A shows an example of the encoded data, and FIG. 1B shows the
output image obtained by decoding the encoded data. In FIGS. 1A and 1B,
I indicates the video data encoded in a frame, P the video data encoded
by forward prediction, and B the video data encoded by bidirectional
prediction. In this mode, the components I, P, B, P and B of the video
data are encoded repeatedly in the order they are mentioned. As a
result, the length of the encoded data differs from frame to frame.
With such a format, reproducing only I provides a sextuple-speed image,
and reproducing I and P generates a double-speed image. The actual
multiple speed is limited by the speed at which the data is read from a
disk. This format is suitable for a high transfer rate, a large
recording capacity, and semi-random access. In this example, as shown
in FIG. 1C, six frames form a GOP, and five GOPs form a packet. It
takes one second to reproduce this packet from the disk. The actual
length of recorded signals on the disk differs from packet to packet,
since the signals are encoded by moving-picture compression techniques.
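
The speed figures above follow from the share of pictures that is
actually reproduced. A minimal sketch, assuming a six-frame GOP that
holds one I picture, two P pictures and three B pictures; the pattern
string and the function name are illustrative and are not taken from
the specification:

```python
def playback_speed(gop_pattern, kept_types):
    """Return the speed-up obtained when only some picture types are reproduced."""
    shown = sum(1 for picture in gop_pattern if picture in kept_types)
    if shown == 0:
        raise ValueError("no picture of the requested types in the GOP")
    # Every GOP spans len(gop_pattern) frame periods but shows only `shown` pictures.
    return len(gop_pattern) / shown


GOP_PATTERN = "IBPBPB"  # assumed picture mix of one six-frame GOP

print(playback_speed(GOP_PATTERN, "I"))   # I pictures only
print(playback_speed(GOP_PATTERN, "IP"))  # I and P pictures
```

Only the counts of each picture type matter here, not their order. As
the text notes, the speed actually reached is further limited by the
read rate of the disk.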

Therefore, a packet consists of 30 frames (= 5 GOPs x 6 frames/GOP).
Each set of 30 frames of audio data is recorded in 48K bytes (= 4 ch x
12K bytes/s). In the case where two channels are used simultaneously,
the minimum memory capacity required is only 24K bytes.

The primary data item and the data rate for each data unit to be
recorded on the disk are as follows:

Expansion data = 128K bits/s = 16K bytes/s
Audio data = 384K bits/s = 48K bytes/s
Image data = 4096K bits/s = 512K bytes/s
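
The figures above can be checked with a few lines of arithmetic. This
is only a sketch: the constant and function names are illustrative, and
the total per one-second data unit is derived here rather than stated
in the specification.

```python
FRAMES_PER_GOP = 6
GOPS_PER_PACKET = 5

# Data rates listed above, in K bits per second.
RATES_KBITS_PER_S = {"expansion": 128, "audio": 384, "image": 4096}


def kbytes_per_second(kbits_per_second):
    """Convert a rate in K bits/s to K bytes/s (8 bits per byte)."""
    return kbits_per_second // 8


frames_per_packet = GOPS_PER_PACKET * FRAMES_PER_GOP
rates_kbytes = {name: kbytes_per_second(rate)
                for name, rate in RATES_KBITS_PER_S.items()}
# Size of one data unit holding 1.0 second of program, before sector alignment.
unit_kbytes = sum(rates_kbytes.values())

print(frames_per_packet, rates_kbytes, unit_kbytes)
```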

The expansion data contains a header and sub-video data. The sub-video
data can be used as, for example, subtitle data used in a movie. The
header is individual management information in the data unit and
contains image-speech synchronizing data. The sub-video data is updated
in units of GOPs containing the corresponding main image. The image and
speech are also synchronized in units of GOPs, and the synchronization
is corrected in units of GOPs, too.

For subtitle data, a plurality of channels may be provided for the
sub-video data so that two types of sub-images, such as an English
scenario and Japanese subtitles for a foreign film, can be output. If
the allocated rate of the sub-video data is 64K bits/s, and if the
recording time of one packet is 1.0 second, the buffer memory capacity
for holding the sub-video data will be approximately 64K bits. The
buffer memory capacity needed for two channels of sub-image may be 32K
bits.

Once the video data, the audio data, and the expansion data have been
encoded, they are completed within the data unit and are totally
independent from other data units.

On the disk there is provided a management area. Each data unit is
read in accordance with the data recorded in the management area. Since
each data unit is processed independently of any other data unit, it
can be easily edited and accessed.

The relationship between the data area and the associated management
information will be described.

In the actual layout, a byte align process is performed for each GOP,
and a sector align process is always carried out for each data unit to
make it easy to segment the data unit. Due to the sector align process
performed, the actual recording capacity of the disk is reduced. In the
case where the display frame rate is 30 frames/sec, each GOP consists
of six pictures (frames), and each data unit consists of five GOPs, the
sector align process is performed for every portion of data which
corresponds to 1.0 second of a program. Therefore, a disk recording a
120-minute program has its recording capacity reduced by 7200 sectors.
This reduction is 0.2% for a disk whose total recording capacity is
346,752 sectors each capable of storing 1 KB of data.
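
The 7200-sector figure is a worst case in which every data unit wastes
up to one sector of padding. A minimal sketch of the alignment
arithmetic, assuming 1 KB sectors as stated above; the function names
and the sample size of 1025 bytes are illustrative only:

```python
SECTOR_BYTES = 1024  # 1 KB per sector, as stated in the text


def aligned_sectors(unit_bytes, sector_bytes=SECTOR_BYTES):
    """Number of whole sectors occupied by one sector-aligned data unit."""
    return -(-unit_bytes // sector_bytes)  # ceiling division


def padding_bytes(unit_bytes, sector_bytes=SECTOR_BYTES):
    """Bytes wasted at the end of the last sector of a data unit."""
    return aligned_sectors(unit_bytes, sector_bytes) * sector_bytes - unit_bytes


def data_units_in_program(minutes, seconds_per_unit=1):
    """Data units needed for a program when each unit holds a fixed play time."""
    return minutes * 60 // seconds_per_unit


# Worst case: each data unit loses (almost) one full sector to padding.
print(data_units_in_program(120))
print(aligned_sectors(1025), padding_bytes(1025))
```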

In a reproducing operation, the image is decoded, beginning with the
first frame (I picture) of the GOP. The speech is decoded, beginning
with the speech frame specified by the image-speech synchronization. At
the time when the decoding of both the specified speech frame and the
start frame of the image GOP has been completed, the image and the
specified speech sample start to be outputted simultaneously.

For audio data, approximately 1.0 second of 
encoded audio data is inserted in the data unit. 
After a certain number of samples are grouped into 
a block, with the adjacent block edges tucked in a 
bit, the speech is encoded in units of this number 
of samples, and a header is added to the encoded 
speech thereby to form an encoded speech frame. 

The length of a speech frame is less than the length of 2048 samples of
the original speech, and corresponds to 24 ms to 36 ms in terms of the
duration of the original speech. The encoded data amount of the speech
frame ranges from 288 bytes to 576 bytes. A frame ID is added to the
header of each speech frame in each speech channel. The frame ID is
made up of 24 bits, 4 bits of which represent a speech channel and 20
bits indicate a speech frame number. The approximately 1.0 second of
audio data is usually as long as several tens of speech frames, though
the length varies with the number of samples in a block and the
sampling frequency. The image synchronization specifies the frame
number of the encoded speech frame containing the decoded speech sample
to be outputted with the timing of outputting the start frame of the
corresponding GOP, and the speech sample number in that frame. The time
code consists of 32 bits, 20 bits of which represent a speech frame
number and the remaining 12 bits of which specify a speech sample
number. This enables the maximum error in the speech and image
synchronization in the entire system to coincide with half the sampling
period of speech. When fs = 32 kHz, the maximum speech synchronization
error is approximately 16 µs.
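
The bit budgets above can be illustrated with a small packing routine.
This is a sketch only: the text gives the field widths but not the bit
order, so placing the frame number (and, for the frame ID, the channel)
in the upper bits is an assumption, and all names are hypothetical.

```python
FRAME_NUMBER_BITS = 20
SAMPLE_NUMBER_BITS = 12
CHANNEL_BITS = 4


def _check(value, bits, name):
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name} does not fit in {bits} bits")


def pack_time_code(frame_number, sample_number):
    """Build the 32-bit time code (assumed layout: frame number in the upper bits)."""
    _check(frame_number, FRAME_NUMBER_BITS, "frame number")
    _check(sample_number, SAMPLE_NUMBER_BITS, "sample number")
    return (frame_number << SAMPLE_NUMBER_BITS) | sample_number


def unpack_time_code(time_code):
    """Split a 32-bit time code into (frame number, sample number)."""
    return (time_code >> SAMPLE_NUMBER_BITS,
            time_code & ((1 << SAMPLE_NUMBER_BITS) - 1))


def pack_frame_id(channel, frame_number):
    """Build the 24-bit frame ID (assumed layout: channel in the upper bits)."""
    _check(channel, CHANNEL_BITS, "channel")
    _check(frame_number, FRAME_NUMBER_BITS, "frame number")
    return (channel << FRAME_NUMBER_BITS) | frame_number


def max_sync_error_us(sampling_rate_hz):
    """Half the sampling period, in microseconds."""
    return 1e6 / (2 * sampling_rate_hz)


print(unpack_time_code(pack_time_code(1234, 615)))
print(max_sync_error_us(32000))
```

Twelve bits address up to 4096 samples, which covers a speech frame of
fewer than 2048 samples.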

FIGS. 2A to 2C show another example of a moving-picture compression
format, and FIGS. 3A to 3C show still another example of a
moving-picture compression format.

The management information recorded in the 
management area will be explained below. The 
management data is recorded in the form of a 
table. 

In the embodiments described above, each 
data unit consists of two or more GOPs. Instead, 
according to the present invention, each data unit 
may contain only one GOP. 

As shown in FIG. 4A, the management table contains a volume identity
field (VID) around the innermost track, a picture information field
(PIF) surrounding the VID, and a data unit allocation table (DAT)
surrounding the PIF. The VID is written, starting at the first byte in
the management table area, and indicates information on various
elements throughout the disk by using 256 bytes. For example, this
information includes data as to whether the disk is for general
recording or for reproduction only. In the picture information field
(PIF), various pieces of data on each program are recorded. For
example, 16 bytes are used for each program.

FIG. 5B shows an example of the contents of 
16 bytes stored in the PIF. 

ATMB is the absolute time of the starting point of the present program
in the volume. In the case of time code search, each item of ATMB data
is checked in the order of reproducing programs to find the number of
the program in which a desired time code is present. Each DAT (to be
described later) in the corresponding program is checked. Then, the sum
of the program time (PTMB, to be described later) and the ATMB is
compared with the desired time code value to find the DAT to which the
corresponding time code belongs. In this procedure, searching can be
effected. By the method based on the absolute starting time, the user
can know the absolute starting time from the desired program and can,
therefore, obtain a specific item of PIF data by searching for the ATMB
corresponding to the absolute starting time.
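
The two-stage time code search described above can be sketched as
follows. The sketch assumes that all times are in seconds, that the
ATMB values are listed in reproduction order and are increasing, and
that the PTMB values of one program are likewise listed in reproduction
order; the function names are hypothetical, and the returned values are
positions in those lists rather than DAT unit numbers.

```python
from bisect import bisect_right


def find_program(atmb_by_program, time_code):
    """Return the position of the program whose span contains time_code."""
    index = bisect_right(atmb_by_program, time_code) - 1
    if index < 0:
        raise ValueError("time code precedes the first program")
    return index


def find_data_unit(atmb, ptmb_by_unit, time_code):
    """Return the position of the data unit of one program holding time_code."""
    # ATMB + PTMB is the absolute start time of each data unit.
    start_times = [atmb + ptmb for ptmb in ptmb_by_unit]
    index = bisect_right(start_times, time_code) - 1
    if index < 0:
        raise ValueError("time code precedes the first data unit")
    return index


# Three programs starting at 0 s, 600 s and 1500 s (illustrative values).
atmb_values = [0, 600, 1500]
program = find_program(atmb_values, 700)
# The second program holds 900 one-second data units.
unit = find_data_unit(atmb_values[program], list(range(900)), 700)
print(program, unit)
```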

PINF indicates program attributes which are allocated to each program.
Among the program attributes are a copy disable flag (CPNH), a program
type (PTYPE), a write attribute (PWRT), and the number of GOPs forming
a data unit (SGDU). If the CPNH is set at 1, it means copy disable, and
if it is set at 0, it means copy enable. The PTYPE, which consists of
three bits, indicates such types as the home video, movie, music,
karaoke, computer graphics, interactive use, game, computer data, or
program. When the PWRT has a value of 1, it means write enable.
The PIF also includes the parameters as shown in FIG. 5B, in which AINF
identifies a speech encoding system, VINF denotes the identification of
an image encoding system, ATRT represents the picture attributes (i.e.,
data for identifying the aspect ratio and a system such as the PAL or
the NTSC system), and HRES and VRES indicate the data on horizontal
resolution and vertical resolution, respectively.

PNTB indicates a start pointer that has a value indicating the DAT
address (data unit number) at which the data unit at the program
starting point is stored. Once the DAT address (data unit number) has
been determined, it is possible to identify the position of the start
sector of a program on the data area.

PGML indicates the program number to be processed immediately after the
current program is finished, when related programs are present.
Namely, the order in which programs are produced does not necessarily
coincide with the order of program numbers. When the current program is
the last program, there is no link destination and all bits of the PGML
are "1".

FIG. 5C shows the structure of the DAT. In this 
table, there are such parameters as a zone number 
(NZON), a sector number (NSTC), and a track 
number (NTRC) on a disk, as well as a program 
time (PTMB), and a link pointer (PNTL). 

NZON is the zone number to which the recording sector at the start of
the data unit belongs. The disk is divided in units of tracks in the
radial direction, from the innermost circumference, and the zone
numbers are allocated in sequence. Specifically, as shown in FIG. 4A,
the data area has a reference position R1 on the disk and the number
begins with 0 at this position. NSTC indicates a sector number in a
zone. The sector number is not a serial number associated with another
track or zone but a number complete only in the track or zone. NTRC
indicates the number of the track in which the zone and the sector
number (the header of the data unit) exist. PTMB is a flag representing
the time position data on the video data (I picture) at the start of
the data unit. The position data indicates a time (in seconds) elapsed
from the program starting point. The time position data is used in
searching for time codes explained earlier. Further, the time position
data is taken in the reproducing apparatus, which uses it as the start
reference data in order to display the program time, absolute time,
remaining time, etc.

PNTL is a flag showing a subsequent data unit immediately following the
present DAT unit number in time. The unit corresponds to the data unit
number. When there is no link destination at the program end, all bits
are set at 1 (= 0xFFFF). The effective value for the link pointer
ranges from 0x0000 to 0xFFFF.

FIG. 4B graphically shows the management area and data area. The blocks
in the data area each indicate programs. The DAT unit numbers are
continuous in this order: 0 to Nmax. The first DAT unit number is
determined by referring to the PNTB in the PIF. If the DAT unit number
is 1, then the next link pointer will be 0. The link pointer of DAT
unit number 0 is Nmax - 1. The link pointer of DAT unit number Nmax - 1
is 2. By checking for the zone number, the sector number, and the track
number according to the change of the DAT unit number, it is possible
to obtain data on the reproduction order such as track 4 in sector 3 in
zone 1, track 7 in sector 2 in zone 0, and track 10 in sector 30 in
zone 3.
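
Following the link pointers amounts to walking a linked list stored in
the DAT. A minimal sketch, assuming Nmax = 5 so that unit Nmax - 1 is
unit 4; the field values of unit 2, the PTMB values and the pairing of
the three locations above with units 1, 0 and 4 are illustrative
assumptions, as are the class and function names:

```python
from typing import NamedTuple

END_OF_PROGRAM = 0xFFFF  # all bits of PNTL set: no link destination


class DatEntry(NamedTuple):
    nzon: int  # zone number
    nstc: int  # sector number within the zone
    ntrc: int  # track number
    ptmb: int  # seconds elapsed from the program start
    pntl: int  # DAT unit number of the data unit to be reproduced next


def reproduction_order(dat, pntb):
    """Follow the link pointers from the start pointer PNTB to the program end."""
    order = []
    visited = set()
    unit = pntb
    while unit != END_OF_PROGRAM:
        if unit in visited:
            raise ValueError("link pointers form a loop")
        visited.add(unit)
        order.append(unit)
        unit = dat[unit].pntl
    return order


DAT = {
    1: DatEntry(nzon=1, nstc=3, ntrc=4, ptmb=0, pntl=0),
    0: DatEntry(nzon=0, nstc=2, ntrc=7, ptmb=1, pntl=4),
    4: DatEntry(nzon=3, nstc=30, ntrc=10, ptmb=2, pntl=2),
    2: DatEntry(nzon=2, nstc=5, ntrc=12, ptmb=3, pntl=END_OF_PROGRAM),
}

for unit in reproduction_order(DAT, pntb=1):
    entry = DAT[unit]
    print(unit, entry.nzon, entry.nstc, entry.ntrc)
```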
FIG. 6A represents the address arrangement of the management table
shown in FIG. 5A, particularly the address arrangement of the DAT.

FIG. 6B shows another address arrangement which the management table
may assume and in which fields not used are provided among the VID, the
PIF and the DAT. In the address arrangement of FIG. 6B, an address
offset will occur when the data search is switched from the VID to the
PIF. The offset data is contained in the data recorded in the VID and
will be recognized when a drive control MPU executes an address
management program.

The recording capacity of the management table will be calculated.

The capacity for recording the management table depends on the number
of programs and the number of data units which are recorded on the
disk. Assuming that 256 programs and 7200 data units (1 sec/unit,
corresponding to 2 hours) are recorded, the data for the management
table amounts to 61952 bytes (= 256 + (16 x 256) + (8 x 7200)). Namely,
in a system wherein a data unit corresponds to about 1 second,
management information for 2 hours can be recorded in a 63KB memory. In
other words, a 63KB memory is practically sufficient for storing the
entire management table.
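
The size calculation above can be reproduced directly. In this sketch
the 256-byte VID and the 16 bytes per program come from the text, and
the 8 bytes per data unit are read off the formula; the constant and
function names are illustrative.

```python
VID_BYTES = 256             # volume identity field
PIF_BYTES_PER_PROGRAM = 16  # one PIF entry per program
DAT_BYTES_PER_UNIT = 8      # one DAT entry per data unit


def management_table_bytes(programs, data_units):
    """Total size of the management table (VID + PIF + DAT) in bytes."""
    return (VID_BYTES
            + PIF_BYTES_PER_PROGRAM * programs
            + DAT_BYTES_PER_UNIT * data_units)


size = management_table_bytes(programs=256, data_units=7200)
print(size, size <= 63 * 1024)
```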

The physical position of the start sector of the management table is
usually defined by ZONE = 0, TRACK = 0 and SECTOR = 0. To protect data,
a plurality of management tables may be recorded in different physical
regions. The management table is frequently referred to. It takes much
time to access the table recorded on the disk. To reduce the access
time, the management table may be mapped in the work RAM incorporated
in the drive control MPU. However, the memory cost will be too much for
the apparatus cost if the table is excessively large, and a great
number of operations must be performed to convert the management table
into desired parameters if the management table is not appropriately
formulated. In view of this, it is desirable to set the system of the
apparatus in accordance with the apparatus cost and the amount of the
table.

FIG. 7 is a block diagram showing the encoder and decoder incorporated
in an apparatus for processing compressed video signals, which is the
first embodiment of the invention. In operation, an original signal is
input to an input terminal 100 and thence to signal separating means
101. The signal separating means 101 separates the original signal into
audio data, video data, expansion data (e.g., subtitle data), a sync
signal, and the like. The audio data is input to speech-data grouping
means 102, the video data to image-data grouping means 103, the
expansion data to expansion-data grouping means 104, and the sync
signal to first system control means 110. While being set in mode 1,
the first system control means 110 controls the image-data grouping
means 103 such that the means 103 forms groups of video data, each
consisting of six frames, controls the speech-data grouping means 102
such that the means 102 forms groups of audio data in units of time of
mode 1, and controls the expansion-data grouping means 104 such that
the means 104 forms groups of expansion data which correspond to the
frames. The groups of video data are input to image-data compressing
means 106, which encodes and compresses the video data in the way
explained with reference to FIGS. 1A, 1B and 1C. The groups of audio
data are input to speech-data compressing means 105, which encodes and
compresses the audio data. The groups of expansion data are input to
expansion-data compressing means 107, which encodes and compresses the
expansion data. The data output from the data compressing means 105,
106 and 107 are input to a formatter 108. The formatter 108 collects
five GOPs (i.e., groups of encoded picture data items), thereby forming
a data unit of the type shown in FIG. 1A. The data unit consists of
encoded audio data, encoded expansion data and a header (i.e.,
additional data). Each data compressing means is controlled so as to
generate encoded data the amount of which is an integral multiple of
the maximum amount of data that can be recorded in one sector of a
recording medium.

Data units output from the formatter 108 are recorded on the recording
medium or supplied to a data transfer system. The signals are read from
the recording medium or transferred from the data transfer system and
then supplied to signal separating means 121. The signal separating
means 121 extracts the encoded audio data, the encoded video data, the
encoded expansion data and the header from each data unit. The encoded
audio data is supplied to a speech decoder 122, which decodes the data,
thereby reproducing an audio signal. The encoded video data is supplied
to an image decoder 123 and decoded. The encoded expansion data is
supplied to an expansion data decoder 124 and decoded. The decoded
video data and the decoded expansion data are supplied to data
synthesizing means 125, which synthesizes the video data and the
expansion data, thereby reproducing a video signal. The data contained
in the header is input to second system control means 126 and used to
generate timing signals and to achieve image-speech synchronization and
mode-setting.

The apparatus shown in FIG. 7 is characterized by specific means of
achieving image-speech synchronization.

The data unit will be described again, in greater detail.



As has been described above, one packet of
video data consists of 30 frames (= 5 GOPs x 6
frames/GOP), and 30 frames of audio data, forming
one set, are recorded in 48K bytes (= 4 ch x 12 K
bytes/s), while the apparatus is set in mode
1. When two channels are used simultaneously, the
minimum memory capacity required is only 24K
bytes.
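
By way of illustration only, the figures given above may be
checked with the following Python sketch. The constant and
function names are not taken from the specification, and "K"
is assumed here to mean 1024.

```python
# Illustrative arithmetic only; names are hypothetical and K = 1024 is assumed.
GOPS_PER_UNIT = 5
FRAMES_PER_GOP = 6
BYTES_PER_CHANNEL = 12 * 1024  # 12 K bytes per channel for one set (mode 1)


def frames_per_unit() -> int:
    """Number of image frames in one packet of video data."""
    return GOPS_PER_UNIT * FRAMES_PER_GOP


def audio_bytes(channels: int) -> int:
    """Memory needed for one set of audio data on the given number of channels."""
    return channels * BYTES_PER_CHANNEL


print(frames_per_unit())        # 30
print(audio_bytes(4) // 1024)   # 48
print(audio_bytes(2) // 1024)   # 24
```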

FIGS. 8A, 8B and 8C show the format of the
encoded video data, the format of the encoded audio
data and the format of the encoded additional data,
respectively. The audio data has been encoded at
a predetermined sampling frequency, and a
prescribed number of sampled segments of data form
a data block. A speech header is added to the data
block, whereby the data block and the speech
header constitute one frame. The speech header
contains a frame ID which identifies the frame.
The header of the data unit contains additional
data. The additional data includes data
representing the relationship between the encoded
video data and the encoded audio data. More
specifically, the encoded video data contains an
image frame number as shown in FIG. 8A, and the
encoded audio data contains a speech frame number
as illustrated in FIG. 8B. As shown in FIG. 8A,
the first frame of the first GOP0 is a specified
picture 1 (SP1), the first frame of the second GOP1
is a specified picture 2 (SP2), and so forth. The
first frame of the last GOP4 is a specified picture 5
(SP5). (Each of these specified pictures is
intra-frame compressed data.) The frames k-1, k+6, ...
k+n of the encoded audio data correspond to SP1,
SP2, ... and SP5, respectively. Data showing this
relation between the SPs of the encoded video
data, on the one hand, and the frames k-1, k+6, ...
k+n of the encoded audio data, on the other hand,
is contained in the additional data, as can be
understood from FIG. 8C. The additional data also
contains data representing the sampling numbers of
the frames k-1, k+6, ... k+n. Therefore, the
additional data indicates that SP1 corresponds to the
frame k-1 of the audio data and has a sampling
number of #615, that SP2 corresponds to the frame
k+6 of the audio data and has a sampling number of
#12, and that SP5 corresponds to the frame k+n of
the audio data and has a sampling number of #920.
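
The relation held in the additional data may be pictured,
purely as a sketch, as a small table of entries. In the
Python below, the class and function names are hypothetical,
and only the three entries spelled out in the text (SP1, SP2
and SP5) are filled in; the values of k and n are whatever
the encoder happens to produce.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SyncEntry:
    specified_picture: int  # 1 for SP1, 2 for SP2, ...
    audio_frame: int        # speech frame number (k-1, k+6, ...)
    sample_number: int      # sampling number within that speech frame


def build_sync_table(k: int, n: int) -> List[SyncEntry]:
    """Entries named in the text; SP3 and SP4 are not given there."""
    return [
        SyncEntry(1, k - 1, 615),
        SyncEntry(2, k + 6, 12),
        SyncEntry(5, k + n, 920),
    ]


def lookup(table: List[SyncEntry], specified_picture: int) -> Optional[SyncEntry]:
    """Return the entry for a specified picture, or None if it is not listed."""
    for entry in table:
        if entry.specified_picture == specified_picture:
            return entry
    return None
```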

The means for generating the additional data 
will be described below, with reference to FIG. 9. 

FIG. 9 illustrates the means for generating the
additional data. An original video signal is supplied
to a terminal 201. The video signal is quantized by
quantizing means 202 and input to a frame
memory 203. The video signals read from the frame
memory 203 are input to image encoding means
204. The image encoding means 204 encodes the
signals, generating video data pieces which
correspond to frames. The video data is supplied to a
formatter (not shown), which generates video data
of the format shown in FIGS. 1A, 1B and 1C.
Meanwhile, a specified-picture frame pulse is
supplied to an input terminal 205 and hence to the
frame memory 203 and the image encoding means
204, serving as a write timing signal and a read
timing signal for the frame memory 203 and also
as a timing signal for the image encoding means
204. A program start pulse is supplied to an input
terminal 206 and hence to a 1/6 frequency divider
207 and also to a speech frame pulse counter 214.
This pulse clears the 1/6 frequency divider 207, which
counts the image frame pulses and generates a
pulse for a specified-picture frame of the type
shown in FIG. 8A. Upon receipt of the program
start pulse, the speech frame pulse counter 214
starts counting speech-frame pulses.

In the meantime, a speech-sampling pulse is
supplied to an input terminal 208, and an original
audio signal is supplied to an input terminal 209.
The original audio signal is sampled and hence
quantized by sampling/quantizing means 210. The
output of the sampling/quantizing means 210 is
input to speech encoding means 211 and encoded
into audio data. In a device (not shown) connected
to the output of the speech encoding means 211,
the speech-frame number generated by the speech
frame pulse counter 214 is added to the header of
the audio data output from the speech encoding
means 211.

The speech-sampling pulse supplied to the input
terminal 208 is input to a 1/N frequency
divider 212 and frequency-divided into speech frame
pulses, so that each frame of audio data may be
sampled with N sampling pulses. The speech frame
pulses are supplied to the speech encoding means
211, which encodes the speech data in units of
frames. The speech frame pulses are supplied, as
clock pulses, to a speech-sampling pulse counter
213. Each speech frame pulse clears the
speech-sampling pulse counter 213. The output of the
speech-sampling pulse counter 213, which
represents the number of samples extracted from one
frame of audio data, is input to a register 215. The
speech-frame number is also input to the register
215. The speech-frame number has been
generated by clearing the speech frame pulse counter
214 by using a program start pulse and counting
the speech frame pulses. Input to the register 215
are the speech-frame number and the number of
speech samples. These data items are latched by
a specified-picture frame pulse and subsequently
output. The number of speech samples is cleared
by a speech-frame pulse. Since the number of
speech samples is latched by the specified-picture
frame pulse while the number is increasing, the
latched number of speech samples is used as a
speech-sample number.

The additional data output from the register
215 is used by the formatter 108 to generate a
data unit of the type shown in FIG. 1A.
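
The latching operation of FIG. 9 may be approximated in
software as follows. This is a minimal sketch, not the circuit
itself: the function name and parameters are hypothetical,
time is measured in speech-sampling pulses since the program
start pulse, and each image frame is assumed to span a whole
number of sampling pulses. The quotient and remainder stand in
for the speech frame pulse counter 214 and the speech-sampling
pulse counter 213, respectively.

```python
from typing import List, Tuple


def generate_additional_data(
    samples_per_speech_frame: int,   # N of the 1/N frequency divider 212
    samples_per_image_frame: int,    # assumed integral, for simplicity
    image_frames: int,
    frames_per_gop: int = 6,         # 1/6 frequency divider 207
) -> List[Tuple[int, int, int]]:
    """Return (specified picture, speech-frame number, speech-sample number)."""
    entries = []
    for image_frame in range(0, image_frames, frames_per_gop):
        # Sampling pulses elapsed when the specified-picture frame pulse arrives.
        elapsed = image_frame * samples_per_image_frame
        # Counter 214 holds the quotient, counter 213 the remainder.
        speech_frame, sample = divmod(elapsed, samples_per_speech_frame)
        entries.append((image_frame // frames_per_gop + 1, speech_frame, sample))
    return entries


# Example with assumed values: 1152 samples per speech frame, 1600 per image frame.
for entry in generate_additional_data(1152, 1600, 30):
    print(entry)
```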

FIG. 10 shows the means for reproducing the 
additional data, thereby to accomplish image- 
speech synchronization. 

The encoded video data, the encoded audio 
data, and the additional data are reproduced, unit 
by unit, from the recording medium (FIG. 7). The 
additional data defines the period during which the 
decoded video data and the decoded audio data 
are to be output. The encoded audio data read 
from the recording medium is input, unit by unit, to
a speech buffer 302 via an input terminal 301 as
shown in FIG. 10. The encoded video data read
from the recording medium is input, unit by unit, to
an image buffer 312 via an input terminal 311. The
additional data is input to a shift register 322 
through an input terminal 321. 

The encoded audio data is input to frame number
extracting means 305, too. The encoded audio
data output from the speech buffer 302 is input to 
speech decoding means 303 and decoded thereby 
in units of frames. The decoded audio data is input 
to a speech block buffer 304. The encoded video 
data output from the image buffer 312 is input to 
image decoding means 313 and decoded thereby 
in units of frames. The decoded video data is input 
to an image frame buffer 314. Blocks of decoded
audio data are sequentially stored into the speech
block buffer 304. 

The speech-frame number extracted from the
header of the encoded audio data by the
frame number extracting means 305 is input to
comparator means 323, which compares this
speech-frame number with the speech-frame number
contained in the additional data held in the shift
register 322. If the numbers compared are identical,
the comparator means 323 generates a coinci- 
dence pulse, which is supplied to gate means 324. 
Then, the sample number contained in the addi- 
tional data is output through gate means 324 to the 
preset input of an address counter 325. 

The sample number supplied to the address
counter 325 designates that location in the speech
block buffer 304 from which the decoded audio data
is to be read. The coincidence pulse from the
comparator means 323 is supplied to speech-sampling
pulse generating means 326 and image frame
pulse generating means 327. In response to the
coincidence pulse, both pulse generating means
326 and 327 start performing their functions,
whereby the audio data is output in synchronism
with the video data, in accordance with the
corresponding sample number whose relationship with
the video data is designated by the additional data.

If the numbers compared by the comparator
means 323 are not identical, the comparator means
323 generates a non-coincidence pulse. Then, the
additional data is shifted in the shift register 322
until the next synchronization data is read into the
register 322. For example, when the comparator
means 323 generates a non-coincidence pulse
during the process wherein SP1 = k + 1, the
additional data is shifted in the register 322 until the
synchronization data SP2 (= k + 6) is read into
the register 322. The frame number, k + 6,
contained in the additional data is supplied to the
comparator means 323, which compares this frame
number with the frame number contained in the
encoded audio data. If the frame numbers
compared are identical, that is, if the means 323
generates a coincidence pulse during the process
wherein SP2 = k + 6, then the video data
supplied to the image decoding means 313 and hence
to the image frame buffer 314 is processed into
decoded picture data of SP2. This synchronization
is performed by an adjusting means 328. In this
case the audio data is output in synchronism with
the picture data of SP2 et seq.
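
One possible reading of the comparison and shifting described
for FIG. 10 is sketched below in Python. It is an
interpretation rather than the disclosed circuit: the function
name is hypothetical, a deque stands in for the shift register
322, and the register is assumed to be shifted whenever the
arriving speech-frame number has already passed the frame
number of the current synchronization data.

```python
from collections import deque
from typing import Iterable, Optional, Sequence, Tuple

Entry = Tuple[int, int, int]  # (specified picture, speech-frame number, sample number)


def synchronize(sync_entries: Sequence[Entry],
                audio_frame_numbers: Iterable[int]) -> Optional[Entry]:
    """Return the entry at which a coincidence pulse would be generated."""
    register = deque(sync_entries)  # stands in for shift register 322
    for frame_number in audio_frame_numbers:
        # Non-coincidence with a frame already passed: shift to the next entry.
        while register and register[0][1] < frame_number:
            register.popleft()
        if not register:
            return None  # no synchronization data left in this data unit
        if register[0][1] == frame_number:
            # Coincidence: the sample number presets address counter 325.
            return register[0]
    return None
```

With entries (1, 9, 615), (2, 16, 12) and (5, 37, 920), audio
arriving from frame 12 onward misses SP1 and is synchronized
at SP2, frame 16, sample 12.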

The adjusting means 328 recognizes the image
frame number, too, by using the output of the
image decoding means 313.

Neither the video data nor the audio data, or 
only the video data, may be output until the com- 
parator means 323 generates a coincidence signal. 
Once the means 323 has generated a coincidence 
signal, the comparator means 323 may be stopped, 
since the speech in a group of pictures is synchro- 
nous with the image in the same group of pictures. 
The comparator means may be periodically driven, 
each time in response to a specified-picture signal. 

In the case where the speech frame number is 
found to be large when a non-coincidence pulse is 
supplied to the adjusting means 328, the process 
goes to the image frame of SP2 or SP4. Nonethe- 
less, synchronization can be secured before the 
process goes to the image frame of SP3 since 
ordinary speech frames have a length of at most
2048 samples. 

As described above, the timing of outputting 
video data from the image frame buffer 314 and 
the timing of outputting the audio data from the 
speech block buffer 304 are controlled for the 
purpose of synchronizing any specified-picture 
frame and a designated speech sample. For the 
same purpose, additional means may be used to 
adjust the time for storing decoded data into a 
buffer memory (not shown) or the time for storing
encoded data into a buffer memory (not shown). 

FIG. 11 shows another type of means for re- 
producing the additional data, thereby to accom- 
plish image-speech synchronization. 

As shown in FIG. 11, encoded video data is
supplied to an input terminal 401 and decoded by
an image decoder/frame buffer 402. An internal
clock signal is supplied to an input terminal 403
and is frequency-divided by a 1/M frequency
divider 404 into image frame pulses. These image
frame pulses are supplied as timing signals to the
image decoder/frame buffer 402. They are supplied
also to a 1/6 frequency divider 405 and
frequency-divided into specified-picture frame pulses which
are synchronous with the specified-picture signals
shown in FIG. 8A.

Encoded audio data is input via an input terminal
406 to speech decoding means 407 and is
decoded thereby. The decoded audio data is input
to a decoded speech block buffer 408. An internal
clock signal is supplied through an input terminal
411 to a 1/N frequency divider 412 and
frequency-divided into speech-sampling pulses. The
speech-sampling pulses are input to speech-frame
pulse generating means 413 and also to a decoded
speech-sample address counter 414. The pulse
generating means 413 generates speech frame
pulses corresponding to speech frames. The
speech frame pulses are supplied, as timing
signals, to the speech decoding means 407 and the
decoded speech-sample address counter 414.

The decoded speech-sample address counter
414 is reset by a speech frame pulse and counts
speech-sampling pulses. Hence, the output of the
address counter 414 represents a speech sample
number. The speech sample number is used as a
read address for the decoded speech block buffer
408, and is input to a register 415. The register 415
latches the speech sample number in response to
a specified-picture frame pulse. The speech
sample number, thus latched, is input to comparator
means 416. The comparator means 416 compares
the speech sample number with the speech
sample number contained in the additional data
supplied from an input terminal 417.

If the speech sample numbers compared are
identical, this means that the video data and the
audio data are synchronous in a prescribed
relationship. If the speech sample numbers compared
are not identical, this means that the speech frame
designated by the additional data is not
synchronous with a specified-picture signal. To render the
speech frame synchronous with the
specified-picture signal, the comparator means 416 supplies a
divider-adjusting signal to the 1/N frequency divider
412, thereby controlling the phase of the
speech-sampling pulses and that of the speech frame
pulses. In effect, the divisor (N) of the 1/N
frequency divider is increased or decreased by 1 to
2. As long as the difference between the two
speech sample numbers compared by the
comparator means 416 falls within a predetermined
range, the video data and the audio data are
maintained synchronous with each other.

Instead of adjusting the divisor of the 1/N
frequency divider 412, the divisor (M) of the 1/M
frequency divider 404 may be adjusted in order to
render the video data and the audio data
synchronous. Alternatively, the divisors of both frequency
dividers 404 and 412 may be adjusted for the
same purpose. No matter whether the divisor
(M) or the divisor (N), or both, are adjusted, the
video data and the audio data can be synchronized
before they become excessively asynchronous,
even though the frequency of the encoding clock
signal differs, though slightly, from the frequency of
the decoding clock signal.
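
The adjustment of the 1/N frequency divider 412 may be
sketched as follows. The function and parameter names are
hypothetical. The direction of the correction is an assumption:
a latched sample number ahead of the expected one is taken to
mean that the speech-sampling pulses are running fast, so the
divisor is enlarged. Wrap-around at speech frame boundaries is
not modelled.

```python
def adjust_divisor(nominal_n: int,
                   latched_sample: int,
                   expected_sample: int,
                   tolerance: int = 0,
                   max_step: int = 2) -> int:
    """Return the divisor to use for the 1/N divider until the next comparison."""
    error = latched_sample - expected_sample
    if abs(error) <= tolerance:
        # Within the predetermined range: video and audio are kept as they are.
        return nominal_n
    # The text allows the divisor to be changed by 1 to 2.
    step = min(max_step, abs(error))
    # Audio ahead: slow the sampling pulses (larger N); audio behind: speed up.
    return nominal_n + step if error > 0 else nominal_n - step


print(adjust_divisor(512, 620, 615))  # 514
print(adjust_divisor(512, 614, 615))  # 511
print(adjust_divisor(512, 615, 615))  # 512
```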

As described above, with the present invention
it is possible to record data efficiently, to manage
data easily, to reproduce programs in a special
manner and search them at high speed, and to
accurately synchronize video data and audio data.

The present invention is not limited to the
embodiment described above. 

FIG. 12 shows a recording/reproducing apparatus
which is a second embodiment of the present
invention. The reproduction system of this apparatus
will be described below.

A disk 10 is placed on a turntable 501, which is
rotated by a motor 502. In the reproduction mode,
a pickup means 503 reads the data recorded on
the disk 10. The pickup means 503 is moved to a
desired track of the disk 10 under the control of a
driving section 504. An output of the pickup means
503 is supplied to a modulation and demodulation
section 601, which demodulates the supplied
signal. The demodulated data is supplied to an error
correction data processing section 602, which
corrects errors and supplies the resulting signal to a
data string processing section 603. The data string
processing section 603 extracts video data, subtitle
and character data, and audio data. On the disk 10,
the subtitle and character data and audio data are
recorded so as to correspond to the video data, as
explained later. Here, various languages can be
selected for the subtitle and character data and
audio data. The selection is made under the control
of a system control section 604. The user supplies
the input from an operator section 605 to the
system control section 604.

Assuming that information on a movie is
recorded on the disk 10, a plurality of scenes the
user can select are recorded. To enable the user to
select any one of the scenes, the data string
processing section 603, the system control section
604, and the operator section 605 in the
reproducing apparatus constitute data string control means
and scene select means, in accordance with the
user's operating of the operator section 605.

The video data separated at the data string
processing section 603 is supplied to a video
processing section 606, which carries out a decode
process according to the type of display unit. For
example, the video data is converted into a suitable
form for an NTSC, PAL, SECAM, or wide screen.
The video signal decoded at the video processing
section 606 is supplied to an adder 608, which
adds it to the subtitle and character data and
supplies the addition result at an output terminal
609.

The audio data separated at the data string 
processing section 603 is supplied to an audio 
processing section 611, which demodulates it and
supplies the demodulated signal at an output termi- 
nal 612. 

The audio processing section acting as a de- 
coding section, which contains an audio processing 
section 613 in addition to the audio processing 
section 611, can also reproduce speech in another
language and supply this reproduced signal at an 
output terminal 614. 

FIG. 13 illustrates the data string processing 
section 603 (FIG. 12) in more detail. 

The data string processing section 603 is de- 
signed to analyze the header (also known as "sub- 
code") of each data unit, to separate the packets 
contained in the data unit and to supply the pack- 
ets to the respective decoders. 

FIG. 14 shows the various types of data which 
are contained in the header of each data unit. The 
DUT header contains a program number, program
time, data-unit size, the starting position of video
data, the starting position of audio data,
image-speech synchronization data, the starting position
of sub-video data, and the like. The program
number (i.e., the number assigned to the program) and
the program time (i.e., the time required to process
the data unit of the program) are 2-byte data items.
The size of the data unit is represented by the
number of bytes which form it. The starting
position of the video data is indicated by the ordinal
number of the first byte of the video data, counted
from the starting byte of the data unit. The
image-speech synchronization data consists of the frame
number and sample number of the audio data
which corresponds to a specified picture frame.
The starting position of the sub-video data is
indicated by the ordinal number of the first byte of
the sub-video data, counted from the starting byte
of the data unit. Three identical sets, each
comprised of data-unit size, starting position of video
data, starting position of audio data, and image-speech
synchronization data, are recorded so that, in case
one or two sets cannot be read or the disk has been
damaged, the remaining set or sets may be read
from the disk. In FIG. 14, the symbol "x 3" shows
that this safety measure has been taken.
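
The "x 3" safety measure may be illustrated by the following
Python sketch. The class and function names are hypothetical,
the field widths of the real header are not modelled, and an
unreadable set is represented simply by None.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class LocatorSet:
    unit_size: int          # size of the data unit in bytes
    video_start: int        # ordinal of the first byte of video data
    audio_start: int        # ordinal of the first byte of audio data
    sync: Tuple[int, int]   # (audio frame number, sample number)


def first_readable(copies: Sequence[Optional[LocatorSet]]) -> LocatorSet:
    """Return the first of the identical sets that could be read."""
    for copy in copies:
        if copy is not None:
            return copy
    raise ValueError("none of the recorded sets could be read")
```

For example, if the first two sets are unreadable, the third
one is returned unchanged.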

As shown in FIG. 13, the data string processing
section 603 comprises a DUT header analyzing
section 701 and a data cache memory 702. The
section 701 analyzes the DUT header. The data
unit is stored into the data cache memory 702. The
section 701 can determine what kind of data is
stored at which address in the data cache memory
702. It can therefore set a read address for the
video data so that the video data (actually a GOP)
may be read from the memory 702, separately
from the other components of the data unit. The
encoded audio data is read from the memory 702,
also separately from the other components of the
data unit. To read the audio data, it is necessary to
supply channel-designating address data to the
data cache memory 702 from the system control
section 604, since there are provided a plurality of
channels. The encoded expansion data is read
from the data cache memory 702 in a similar
manner.

As has been explained, the embodiment can
record data efficiently, easily manage data,
reproduce programs in a special manner and search
them at high speed. This is because each data unit
is formed of a header portion, an expansion data
portion, an encoded audio data portion and an
encoded video data portion, and the header portion
contains data-unit size, the starting position of
video data, the starting position of audio data, the
starting position of expansion data, image-speech
synchronization data, and the like. The DUT header
analyzing section 701 analyzes the header portion
and determines what kind of data is stored at which
address in the data cache memory 702, thereby
setting a read address for the video data so that
any encoded data may be supplied from the
memory 702 to the decoder, separately from the other
components of the data unit.
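
How the starting positions in the header allow each portion to
be read separately may be sketched as below. The function name
is hypothetical, positions are taken as zero-based byte offsets
from the start of the data unit, and the portions are assumed
to follow one another in the order header, expansion (sub-video)
data, audio data, video data.

```python
from typing import Dict


def split_data_unit(unit: bytes,
                    sub_video_start: int,
                    audio_start: int,
                    video_start: int) -> Dict[str, bytes]:
    """Slice a cached data unit into its portions using header offsets."""
    if not 0 <= sub_video_start <= audio_start <= video_start <= len(unit):
        raise ValueError("starting positions are inconsistent with the unit size")
    return {
        "header": unit[:sub_video_start],
        "sub_video": unit[sub_video_start:audio_start],
        "audio": unit[audio_start:video_start],
        "video": unit[video_start:],
    }


parts = split_data_unit(b"HHHHSSSAAAAAVVVVVV", 4, 7, 12)
print(parts["audio"])  # b'AAAAA'
```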

Safety measures have been taken for the disk,
particularly for the management information, as will
be explained below.

As shown in FIG. 15A, the information area of
the disk 10 has a management area on the inner
side and a data area outside the management area,
for example. In the management area,
management information needed to access the data in the
data area is recorded as explained later. In the data
area, information including a header, sub-video
data, audio data, and video data is recorded.

As shown in FIG. 15B, in the management
area, for example, the identical contents of
management information are recorded in the section
(P1 to P2) of the innermost two and a half tracks and
the next two-and-a-half-track section (P2 to P3). That
is, the start positions of the identical contents of
management information are set on radiating lines
with different angles on the disk 10. In this
embodiment, the angle that the two radiating lines make
is 180 degrees.

Two sets of management information are
recorded on the disk 10. Hence, if one of them
cannot be read from the disk due to dirt, the other
set of management information can be used. This
prevents the important information from being lost
in accessing the data area. The two sets of
management information are recorded in different
positions on the disk.

Therefore, even if the disk is scratched or
stained with foreign matter, there is a very low
probability that, for example, positions directly
opposite each other across the center of the disk are
both damaged or stained with foreign matter as shown
by a shaded portion. Accordingly, it is important in
terms of safety that management information is 
recorded in different angular positions on the disk. 

If the management information cannot be read,
it is particularly fatal to the reproduction of data
from the disk. Thus, it is important that
more than one set of the same management
information is recorded on the disk as described
above. Namely, as long as the management
information can be read, the data on the disk can be
accessed even if part of the data area is damaged.
Since some data area may contain unused
portions, recording more than one set of
management information helps improve the
reliability of the disk 10.

When the amount of all data recorded on the
disk 10 is smaller than the total recording capacity
of the disk, or when all pieces of the recorded data
are important, more than one set of the managed
information in the data area may be recorded. In
this case, too, the start position of each item of
information is set on a different radial line. In the
embodiment described above, the recording start
positions differ from each other by an angle of 180
degrees. The angular difference is not restricted to
this. For instance, it may be 90 degrees or another angle.
While in the embodiment two sets of the same
data are recorded, three or four sets of the same
data may be recorded.

What types of data are recorded in the data 
area will be described. 

FIG. 16 is an enlarged view of the contents of 
data unit DUT #0 in the data area. In data unit DUT 
#0, there is a subcode (SUB-CODE) at the start,
followed by a sub-picture (SUB-PICTURE), audio
data (AUDIO), and video data (VIDEO) in that order.
The subcode (SUB-CODE) contains the attributes
of data unit DUT #0 and control data on the data
unit. The sub-picture (SUB-PICTURE) contains
subtitle data (for movie video) or character data (for
karaoke video and educational video), for example.
The subtitle data and the character data are each
given PICTURE #0 to #7, all of which or some of
which differ from each other in language and the
rest contain no signals. The audio (AUDIO) data is
recorded in up to eight different languages AUDIO
#0 through #7 (each reproduction lasts
approximately one second). Each piece of audio data is
recorded in frames, each frame, #0, #1, and so on
being composed of headers (HEADERS) and data
(DATA). The video data (VIDEO) contains 30
frames of images (approximately one second of
reproduction), for example. The video (VIDEO)
information is recorded by high-efficiency image
encoding compression techniques. The number of frames
is not limited by standards.

As described above, different languages are 
recorded on the disk, and at least two decoders for 
speech reproduction are incorporated in the
reproducing apparatus. Hence, at least two of the
languages can be combined in the apparatus. For 
expensive models, more video decoders, more 
speech decoders, and more subtitle and character 
data decoders may be used. 

An example of the management information 
recorded in the management area will be
explained. The management information is stored in
the form of a table. 

As shown in FIG. 17A, a table of language
codes is recorded in the VID, showing what
language is recorded in which data area. The
language codes correspond to description codes 0, 1,
..., 8. In this example of a disk, the description
code 0 corresponds to non-language, or
background sound and music (B & M), and the
description codes 1, 2, 3, and 4 correspond to English,
Japanese, French, and German, respectively. The
correspondence between each description code
and each language code is known when the VID is
read at the start of the reproducing apparatus.

On the other hand, bit data strings are defined 
in the PIF table. Specifically, description codes 
correspond to data string numbers #0 through #7 
on the disk (FIG. 17B). When a data string number 
is selected, a description code is determined and 
the language code corresponding to the description 
code is also determined. 

Therefore, when the reproducing apparatus 
reads the data in the PIF table, it displays the first 
menu screen in accordance with data string num- 
bers #0 to #7 (a display by the key display signal). 
This display is effected by, for example, supplying 
a language code to a conversion table to generate 
the display data corresponding to each language 
code. To supply the code of a language the user 
can understand, the user only needs to select and 
input the corresponding data string number by 
operating the operator section. 

For example, when the user selects the data 
string number #0, the description code 1 is dis- 
played. At this time, D1 (i.e., English) is selected
for speech. When the user selects the data string 
number #2, D2 (Japanese) is selected for speech. 
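
The two-step lookup from data string number to language may be
sketched as follows. The dictionary and function names are
hypothetical; the VID table carries the five description codes
named in the text, and the PIF table carries only the two data
string numbers given in the example above, since the remaining
entries are not stated.

```python
from typing import Optional

# VID table: description code -> language (as listed for FIG. 17A).
VID_LANGUAGES = {
    0: "B & M",
    1: "English",
    2: "Japanese",
    3: "French",
    4: "German",
}

# PIF table: data string number -> description code (only #0 and #2 are given).
PIF_DESCRIPTION_CODES = {0: 1, 2: 2}


def language_for_data_string(number: int) -> Optional[str]:
    """Resolve a data string number to a language, or None if it is not listed."""
    code = PIF_DESCRIPTION_CODES.get(number)
    if code is None:
        return None
    return VID_LANGUAGES.get(code)


print(language_for_data_string(0))  # English
print(language_for_data_string(2))  # Japanese
```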

After the user has selected a language, a
producer's comment is displayed in the language
selected. The data address at which the comment
information is recorded is recorded in, for example,
the VID table. The comment data is displayed in
the language the user can understand. For
example, it is displayed on the second menu screen, in
the language which the user has selected at the
first menu screen. If the user has selected #2 at
the first menu screen, the comments are displayed in
Japanese. The comments include a greeting from
the producer, the date of production, the intention
of the product, and the program time in the case of
movies, for example. Seeing these comments
displayed, the user can select an output mode for
speech and subtitles, by pushing the speech and
subtitle change buttons provided at the operator
section. When the user pushes the speech change
button, a cursor appears on the screen. Each time
the speech change button is pressed, the cursor
moves from one item to another in the language
column, from non-language to Japanese, English,
French, German, and so on. Upon lapse of a
predetermined time after the
cursor has been moved to the desired item, the
desired item is selected unless the button is
pushed during that predetermined time. The subtitle
change button is similarly operated, to select the
subtitle in the desired language.

When neither the speech change button nor the
subtitle change button has been operated for
the predetermined time, the reproduction mode in
the speech selected at the first menu screen will
be effected. The speech output mode and the
subtitle display mode can be changed during
operation of the reproducing apparatus.

When one of the programs is selected, that is,
when a data string is selected, the system control
section of the reproducing apparatus controls the
pickup-driving section. The pickup-driving section
moves the pickup, which reads the selected
program from the disk.

As may be understood from the above, the
management information is extremely important in
accessing the disk. If the management
information could not be read, it would be fatal to the
reproduction of data from the disk.

FIG. 18 is a flow chart explaining how the data
string processing section 603 processes the signals
supplied to it via the error correction data
processing section 602. The section 603 receives the data
supplied from the section 602 and determines
whether or not the data contains errors.

More specifically, the first management
information is read from the disk (Steps S11 and S12).
Then, the data string processing section 603
determines whether or not the information contains
errors (Step S13). If NO, the information is stored as
a management table into the work memory
incorporated in the system control section 604 (Step
S20). If YES, the section 603 determines whether
or not the errors can be corrected, for example by
counting the number of the errors (Step S14). If
YES in Step S14, the errors are corrected (Step
S19). The information, thus corrected, is stored as
a management table into the work memory. If NO
in Step S14, the second management information
is read from the disk (Step S15). Next, the data
string processing section 603 determines whether
or not the information contains errors (Step S16). If
NO in Step S16, the information is stored as a
management table into the work memory
incorporated in the system control section 604 (Step
S20). If YES in Step S16, the section 603
determines whether or not the errors can be corrected,
for example by counting the number of the errors
(Step S17). If YES in Step S17, the errors are
corrected (Step S19). The data, thus corrected, is
stored as a management table into the work
memory. If NO in Step S17, a warning is displayed
(Step S18).
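
The flow of FIG. 18 may be illustrated by the following minimal
sketch. It is not the circuitry of the embodiment; the names
ManagementCopy, load_management_table and correct, as well as the
threshold MAX_CORRECTABLE_ERRORS, are hypothetical and introduced
only for illustration. The sketch assumes that the error count of
each copy is already known and that correctability is judged by
comparing that count with the threshold.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class ManagementCopy:
    """One copy of the management information as read from the disk."""
    payload: bytes
    error_count: int  # number of errors detected in this copy (assumed known)


# Hypothetical limit: copies with at most this many errors are correctable.
MAX_CORRECTABLE_ERRORS = 4


def load_management_table(
    copies: Iterable[ManagementCopy],
    correct: Callable[[ManagementCopy], bytes],
    max_correctable: int = MAX_CORRECTABLE_ERRORS,
) -> Optional[bytes]:
    """Return the management table from the first usable copy, else None."""
    for copy in copies:                          # read first, then second copy
        if copy.error_count == 0:                # no errors: store as it is
            return copy.payload
        if copy.error_count <= max_correctable:  # errors can be corrected
            return correct(copy)                 # correct, then store
        # otherwise fall through and try the next copy
    return None                                  # no usable copy: warning case
```

A caller would store the returned table in the work memory and
display a warning when None is returned. The loop accepts any
number of copies, although the flow chart describes exactly two.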

In the embodiment described above, it is
determined whether the first management information is
valid or invalid. If the first management information
is invalid, the second management information is
read from the disk. Alternatively, both the first
management information and the second management
information may be read from the disk and
simultaneously examined for errors. In this case, if
errors are found in a part of one of the
management information items, that part is automatically
replaced by the corresponding part of the other
management information, thereby removing the errors,
and the error-free management information is
stored into the work memory.
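
The part-by-part replacement described above can be sketched as
follows. This is only an illustration under stated assumptions:
each copy is taken to be already divided into the same number of
parts, and a part found to contain errors is represented by None.
The function name merge_management_copies and this representation
are hypothetical and do not appear in the embodiment.

```python
from typing import List, Optional, Sequence

Part = Optional[bytes]  # None marks a part in which errors were found


def merge_management_copies(
    first: Sequence[Part], second: Sequence[Part]
) -> Optional[List[bytes]]:
    """Build an error-free table by taking each part from whichever copy is intact.

    Returns None when the same part is damaged in both copies.
    """
    if len(first) != len(second):
        raise ValueError("both copies must be divided into the same parts")
    merged: List[bytes] = []
    for part_a, part_b in zip(first, second):
        part = part_a if part_a is not None else part_b
        if part is None:
            return None  # this part cannot be recovered from either copy
        merged.append(part)
    return merged
```

For example, merging [b"a", None, b"c"] with [b"a", b"b", None]
yields [b"a", b"b", b"c"], since every damaged part has an intact
counterpart in the other copy.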

As has been described, the reproducing
apparatus and the disk, both according to the present
invention, can minimize the possibility that
important information to be used in the apparatus is
destroyed completely when the disk is damaged.

Claims 

1. An apparatus for generating compressed sig- 
nals, comprising: 

image grouping/compressing means (103, 
106) for processing video data into groups, 
each consisting of video data items or frames 
which correspond to a predetermined repro- 
ducing time, and for compressing and encod- 
ing the video data items of each group; 

speech grouping/compressing means (102, 
105) for processing audio data corresponding 
to the video data into groups, each group 
consisting of audio data items, and for com- 
pressing and encoding the audio data items of 
each group; and 

a formatter (108) for combining a plurality
of groups of compressed and encoded video
data items, which have been supplied from
said image grouping/compressing means, into
a video-data packet, for combining a plurality
of groups of compressed and encoded audio
data items, which have been supplied from
said speech grouping/compressing means, into
an audio-data packet, for combining at least
the video-data packet and the audio-data
packet into a data unit, and for supplying the data
unit to a recording system or a transfer
system.

2. The apparatus according to claim 1, further
comprising sub-video data grouping/compressing
means (104, 107) for processing sub-video data
into groups, each consisting of sub-video data
items which correspond to the video data items
of a group, and for compressing and encoding
the sub-video data items of each group, and
said formatter (108) inserts the groups of
encoded sub-video data items into said data unit.

3. The apparatus according to claim 1,
characterized in that said data unit consists of a
subcode, a packet of encoded sub-video signal
data, a packet of encoded audio data and a
packet of encoded video data, which are
arranged in the order mentioned along a time
axis, said subcode contains synchronization
data representing correspondence of a group
of encoded audio data items and a group of
encoded video data items, and a first item of
the sub-video data items of each group, a first
item of the audio data items of each group and
a first item of the video data items of each
group is each a top item before the group is
formed.

4. The apparatus according to claim 3,
characterized in that said image grouping/compressing
means processes video data into five groups,
each consisting of six frames.

5. The apparatus according to claim 3,
characterized in that said image grouping/compressing
means processes video data into three groups,
each consisting of twelve frames.

6. The apparatus according to claim 3,
characterized in that said image grouping/compressing
means processes video data into three groups,
each consisting of ten frames.

7. An apparatus for reproducing compressed
signals from a data unit (OUT) comprised of video
data groups each consisting of compressed
and encoded video data items or frames which
correspond to a predetermined reproducing
time, audio data groups corresponding to the
video data groups, each consisting of
compressed and encoded audio data items, and
sub-video data groups corresponding to the video
data groups, each consisting of compressed
and encoded sub-video data items, said
apparatus comprising:

a speech decoder (122) for separating the
audio data groups from the data unit and
decoding the audio data groups, thereby
generating decoded audio data;

an image decoder (123) for separating the
video data groups from the data unit and
decoding the video data groups, thereby
generating decoded video data;

a sub-image decoder (124) for separating
the sub-video data from the data unit and
decoding the sub-video data, thereby
generating decoded sub-video data; and

data synthesizing means (125) for
combining the decoded video data generated
by said image decoder and the decoded
sub-video data generated by said sub-image
decoder.

8. An apparatus for managing compressed
signals, comprising memory means for storing
program information recorded on a recording
medium, said program information forming a
data allocation table which consists of numbers
assigned to tracks on the recording medium,
numbers assigned to zones forming each
track, numbers assigned to sectors forming
each track, and a link pointer of a data unit to
be reproduced next.

9. The apparatus according to claim 8,
characterized in that the program information comprises
data units and numbers assigned to the data
units, each data unit consisting of:

video data groups each consisting of
compressed and encoded video data items or
frames which correspond to a predetermined
reproducing time;

audio data groups corresponding to the
video data groups, each consisting of
compressed and encoded audio data items; and

sub-video data groups corresponding to
the video data groups, each consisting of
compressed and encoded sub-video data items.

10. An apparatus for synchronizing compressed
signals, comprising:

an encoder section which comprises:
image grouping/compressing means (202,
203, 204) for encoding a predetermined
number of image frames which corresponds to a
predetermined reproducing time of an original
image, thereby generating encoded video data
items, and for combining the encoded video
data items into a video packet;

speech grouping/compressing means (210,
211) for processing encoded audio data
corresponding to the packet of encoded video
data items, thereby generating speech frames,
and for combining the speech frames into an
audio packet;

additional data generating means (212,
213, 214, 215) for generating additional
data consisting of a speech frame number
assigned to that speech frame included in
said audio packet which represents an original
speech corresponding to a start timing of a
specified image frame included in the video
packet; and

a formatter for combining the additional
data, the audio packet and the video packet
into a data unit, and
a decoder section which comprises:

decoding means (312, 313, 302, 303) for
decoding the encoded video data, encoded
audio data and additional data of each data
unit; and

output timing setting means (305, 322,
323, 324, 325, 304, 327, 314) for setting timing
of outputting a first specified image frame,
when a speech frame number contained in
said encoded audio data coincides with a
speech frame number contained in said
additional data.

11. The apparatus according to claim 10,
characterized in that said additional data generating
means comprises:

pulse generating means (207) which is
cleared by a program start pulse and which
generates a specified image frame pulse used
as an image frame pulse for the original image,
at a time corresponding to the start of said
specified image signal;

first frequency dividing means (212) for
frequency-dividing speech sampling pulses for
sampling the original speech, thereby
obtaining a speech frame pulse;

a speech sampling pulse counter (213)
which is cleared by the speech frame pulse
supplied from said first frequency dividing
means and which counts the speech sampling
pulses;

a speech frame pulse counter (214) which
is cleared by the program start pulse and
which counts the speech frame pulses; and

a register (215) for latching a count value
from said speech sampling pulse counter as
the speech sample number and a count value
from said speech frame pulse counter as the
speech frame number, in response to the specified
image frame pulse, and for outputting the
speech sample number and the speech
frame number.

12. The apparatus according to claim 10,
characterized in that said output timing setting
means comprises:

means (305) for extracting the speech
frame number from the encoded audio data;

comparing means (323) for comparing the
speech frame number extracted from the
encoded audio data with the speech frame
number contained in the additional data;

means (324, 325) for presetting the speech
sample number contained in the additional
data into an address counter, in response to a
coincidence pulse generated by said
comparing means;

means (304, 326) which starts reading
decoded audio data from a speech block buffer
in response to an address supplied from said
address counter; and

means (314, 327) which starts reading
decoded video data from an image block buffer
in response to the coincidence pulse.

13. An apparatus for synchronizing compressed
signals, comprising:

an encoder section which comprises:

image grouping/compressing means (202,
203, 204) for encoding a predetermined
number of image frames which corresponds to a
predetermined reproducing time of an original
image, thereby generating encoded video data
items, and for combining the encoded video
data items into a video packet;

speech grouping/compressing means (210,
211) for processing encoded audio data
corresponding to the packet of encoded video
data items, thereby generating speech frames,
and for combining the speech frames into an
audio packet;

additional data generating means (212,
213, 214, 215) for generating additional
data consisting of a speech frame number
assigned to that speech frame included in
said audio packet which represents an original
speech corresponding to a specified image
frame included in the video packet; and

a formatter for combining the additional
data, the audio packet and the video packet
into a data unit, and
a decoder section which comprises:

image decoding means (402) for decoding
the encoded video data of each data unit into
decoded video data;

a frame buffer (402) for storing the
decoded video data of each data unit, supplied
from said image decoding means;

speech decoding means (407) for
decoding the encoded audio data of each data unit
into decoded audio data;

a speech block buffer (408) for storing the
decoded audio data of each data unit, supplied
from said speech decoding means;

first frequency dividing means (404) for
frequency-dividing an internal clock signal,
thereby generating an image frame pulse
defining a timing of outputting data from said
frame buffer;

a second frequency dividing means (412)
for frequency-dividing the internal clock signal,
thereby generating a speech sampling pulse
and a speech frame pulse;

a decoded speech sample address
counter (414) which is reset by the speech frame
pulse, which counts speech sampling pulses
and which generates a read address for said
speech block buffer;

a register (415) for latching the read
address generated by said decoded speech
sample address counter, in response to a specified
image frame pulse obtained by
frequency-dividing the image frame pulse; and

means (416) for comparing
a speech sample number contained in the
additional data with the address supplied from
the register, and for performing
synchronization adjustment on the image frame pulse
under control of said first frequency dividing
means or on the decoded speech sampling
pulse under control of said second frequency
dividing means, when a difference between the
speech sample number and the address is
equal to or greater than a predetermined value.

14. An apparatus for reproducing compressed
signals from a data unit containing encoded video
data generated by compressing and encoding
a plurality of image frames, encoded audio
data generated by compressing and encoding
a plurality of speech frames corresponding to
the image frames, and a data unit header
representing a position of the encoded video
data and a position of the encoded audio data,
said apparatus comprising:

memory means (702) for temporarily
storing at least the encoded video data and the
encoded audio data contained in the data unit;
and

control means (701) for analyzing the data
unit header contained in the data unit,
recognizing an address in said memory means, at
which the encoded video data and the
encoded audio data are stored, and supplying the
encoded video data and the encoded audio
data from said memory means to an image
decoder and a speech decoder, respectively.

15. The apparatus according to claim 14,
characterized in that said data unit header contains
identical sets of data, each set consisting of
data representing a size of the data unit, data
representing a starting position of the encoded
video data and data representing a starting
position of the encoded audio data.

16. The apparatus according to claim 14,
characterized in that said encoded audio data is
one for a plurality of channels, and said control
means (701) selects a channel while supplying
the encoded audio data to the speech
decoder.

17. The apparatus according to claim 16,
characterized in that said control means (701)
separates the encoded audio data in accordance
with a channel designating signal supplied
from an external device, in order to select a
channel.

18. An apparatus for reproducing compressed
signals from a data unit containing a header,
encoded sub-video data, encoded audio
data and encoded video data, said header
containing data representing a size of
the data unit, data representing a starting
position of the encoded video data and data
representing a starting position of the encoded
audio data, said apparatus comprising:

data cache memory (702) for storing the
data unit; and

header analyzing means (701) for
analyzing the data of the header, recognizing
addresses in said data cache memory, at which
the encoded audio data, the encoded video
data and the encoded sub-video data are
stored, and designating read addresses,
thereby to supply the encoded audio data, the
encoded video data and the encoded
sub-video data to different decoders, respectively.

19. A disk structure having a management area on
a central portion and a data area surrounding
the management areas, wherein identical
management data items are recorded in the
management areas, data to be accessed based on
the management data item is recorded in the
data area, and starting positions of the identical
management data items are set on different
radial lines spaced apart by different angles.



20. An apparatus for reproducing data from a disk
having a management area on a central portion
and a data area surrounding the management
areas, with identical management data items
recorded in the management area, and data to
be accessed based on the management data
item recorded in the data area, said apparatus
comprising:

first data-checking means (S11 to S14) for
reading a first management data item from the
management area when the apparatus is started
and checking the first management data item,
thereby determining whether or not it is
possible to use the first management data item;

data-reading means (S10, S20, S15) for
storing the first management data item into a
work memory when said first data-checking
means determines that it is impossible to use
the first management data item, and for reading a
second management data item from the
management area;

second data-checking means (S16, S17)
for checking the second management data
item, thereby determining whether or not it is
possible to use the second management data
item; and

warning-generating means (S19, S20, S15)
for storing the second management data item
into the work memory when said second
data-checking means determines that it is
impossible to use the second management data
item, and for generating a warning.

(54) Video signals compression/decompression device for video disk recording/reproducing
apparatus

(57) An apparatus for generating compressed
signals, which can serve to record data with high efficiency,
to manage data easily, to reproduce programs in a
special manner, to search for data at high speed, and to
achieve accurate image-speech synchronization. The
apparatus comprises a video-data grouping device (103)
for processing video data into groups, each consisting of
a predetermined number of video data items
corresponding to image frames, a video-data compressing
device (106) for compressing and encoding the video
data items of each group, an audio-data grouping device
(102) for processing audio data corresponding to the
video data into groups, each group consisting of audio
data items, an audio-data compressing device (105) for
compressing and encoding the audio data items of each
group, a sub-video data grouping device (104) for
processing sub-video data into groups, each consisting
of sub-video data items, a sub-video data compressing
device (107) for compressing and encoding the
sub-video data items of each group, a formatter (108) for
combining the groups of compressed video data items,
the groups of compressed audio data items and the
groups of compressed sub-video data items, thereby
generating a data unit, a data-separating device (121) for
separating the compressed data items, a speech
decoder (122) for decoding the encoded audio data
items, an image decoder (123) for decoding the encoded
video data items, a sub-image decoder (124) for
decoding the encoded sub-video data items, and a
synthesizing device for combining the decoded video data
items with the decoded sub-video data items.